Learning the Latent "Look": Unsupervised Discovery of a Style-Coherent Embedding from Fashion Images
نویسندگان
چکیده
What defines a visual style? Fashion styles emerge organically from how people assemble outfits of clothing, making them difficult to pin down with a computational model. Low-level visual similarity can be too specific to detect stylistically similar images, while manually crafted style categories can be too abstract to capture subtle style differences. We propose an unsupervised approach to learn a style-coherent representation. Our method leverages probabilistic polylingual topic models based on visual attributes to discover a set of latent style factors. Given a collection of unlabeled fashion images, our approach mines for the latent styles, then summarizes outfits by how they mix those styles. Our approach can organize galleries of outfits by style without requiring any style labels. Experiments on over 100K images demonstrate its promise for retrieving, mixing, and summarizing fashion images by their style.
منابع مشابه
Learning the Latent “Look”: Unsupervised Discovery of a Style-Coherent Embedding from Fashion Images (Supplementary material)
In our main paper in Section 4.1, we show top images for the five discovered topics on the HipsterWars dataset [1] using our method; here, we show the corresponding top images for the StyleNet CNN [3] and attribute clusters. Figure 1a are results for CNN clusters: Goth, Bohemian, and Preppy styles are succesfully discovered; however, there are 2 variants of Goth, one for pants and one for skirt...
متن کاملStyle2Vec: Representation Learning for Fashion Items from Style Sets
With the rapid growth of online fashion market, demand for effective fashion recommendation systems has never been greater. In fashion recommendation, the ability to find items that goes well with a few other items based on style is more important than picking a single item based on the user’s entire purchase history. Since the same user may have purchased dress suits in one month and casual de...
متن کاملPixelGAN Autoencoders
In this paper, we describe the “PixelGAN autoencoder”, a generative autoencoder in which the generative path is a convolutional autoregressive neural network on pixels (PixelCNN) that is conditioned on a latent code, and the recognition path uses a generative adversarial network (GAN) to impose a prior distribution on the latent code. We show that different priors result in different decomposit...
متن کاملDiscovering Multi-relational Latent Attributes by Visual Similarity Networks
The key problems in visual object classification are: learning discriminative feature to distinguish between two or more visually similar categories ( e.g. dogs and cats), modeling the variation of visual appearance within instances of the same class (e.g. Dalmatian and Chihuahua in the same category of dogs), and tolerate imaging distortion (3D pose). These account to within and between class ...
متن کاملLearning Latent Representations for Speech Generation and Transformation
An ability to model a generative process and learn a latent representation for speech in an unsupervised fashion will be crucial to process vast quantities of unlabelled speech data. Recently, deep probabilistic generative models such as Variational Autoencoders (VAEs) have achieved tremendous success in modeling natural images. In this paper, we apply a convolutional VAE to model the generativ...
متن کامل